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Abstract. To improve confounder adjustments, observational studies 
are often matched on potential confounders. While matched case-control 
studies are common and well covered in the literature, our focus here 
is on matched cohort studies, which are less common and sparsely dis- 
cussed in the literature. Matched data also arise naturally in twin stud- 
cohort of exposure-discordant twins can be viewed as being 
matched on a large number of potential confounders. The analysis of 
twin studies will be given special attention. We give an overview of vari- 
ous analysis methods for matched cohort studies with binary exposures 
and binary outcomes. In particular, our aim is to answer the following 
questions: (1) What are the target parameters in the common analysis 
methods? (2) What are the underlying assumptions in these methods? 
(3) How do the methods compare in terms of statistical power? 
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1. INTRODUCTION 

A common goal of epidemiological research is to 
estimate the causal effect of a particular exposure 
on a particular outcome. The common tool is an 
observational study, utilizing, for example, hospital 
data, cohort data or health register data. In observa- 
tional studies, the exposure-outcome association is 
invariably confounded by factors that induce spuri- 
ous (i.e., noncausal) associations. For example, age 
may confound an exposure-outcome association if 
older people are more often exposed and more likely 
to develop the outcome. Without adjustment for 
age, that is, if the confounding influence by age is not 
accounted for in the analysis, there may be an asso- 
ciation of exposure and outcome, even in the absence 
of a causal effect. Hence, the exposure-outcome asso- 
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ciation cannot, in general, be given a causal interpre- 
tation, unless all confounders are properly adjusted 
for. 

There are several strategies to adjust for potential 
confounders in the analysis, for example, stratifica- 
tion or regression modeling. Essentially, these meth- 
ods solve the problem of confounding by comparing 
the exposed and unexposed within levels of the con- 
founders, thus balancing the confounders across lev- 
els of the exposure and comparing "like with like." If 
there is a strong association between the confound- 
ers and the exposure, or between the confounders 
and the outcome, these strategies are often ineffi- 
cient. In particular, some strata may contain few 
exposed subjects or few cases (i.e., subjects that de- 
veloped the outcome); the lack of balance may lead 
to unstable estimates for these strata. 

One common method to increase the efficiency is 
to match the study on potential confounders. For ex- 
ample, matched case-control studies are constructed 
so that for each fixed number of controls 

are selected, having the same confounder levels as 
the case. When each case is matched to one con- 
trol, we say that the study is 1:1 matched. In case- 
control studies, matching forces the ratio of cases 
to controls to be constant across all strata of the 
matched factors, which implies that the association 
between the confounders and the outcome is broken. 
Matched case-control studies are commonplace, and 
well covered in the literature (e.g., Breslow and Day, 
1980; Jewell, 2004; Woodward, 2005). A matched 
cohort study can be constructed in a similar fash- 
ion; for each exposed subject, a fixed number of 
unexposed subjects are selected, having the same 
confounder levels as the exposed. In cohort studies, 
matching forces the ratio of exposed to unexposed 
to be constant across all strata of the matched fac- 
tors, which implies that the association between the 
confounders and the exposure is broken. Matched 
cohort studies are relatively rare, and the literature 
is sparse and typically rather brief (e.g., Cummings 
et al., 2003). The reason, we believe, is mainly due to 
available data sources. Matched cohort studies are 
suitable for situations where a researcher has access 
to large population data sources with exposure in- 
formation. 

Matched data also arise naturally in twin studies. 
By nature, a large number of potential confounders 
are shared (i.e., having constant levels) within each 
twin pair, for example, genetic factors, maternal uter- 
ine environment, gestational age, etc. It follows that 
a cohort of exposure-discordant twin pairs (i.e., pairs 



in which one of the twins is exposed, and the other 
twin is unexposed) can be viewed as being 1 : 1 matched 
on all shared confounders. In such a cohort there is 
no association between the shared confounders and 
the exposure. An attractive feature of twin studies 
is that the shared confounders often include factors 
which are normally very difficult to match on, or 
even to measure. For example, monozygotic twins 
have identical genes and can thus can be viewed as 
being matched on the whole genome. However, a twin 
study is not simply a special case of a regular 1:1 
matched cohort study; whereas the latter only con- 
tains exposure-discordant pairs, the former also con- 
tains pairs which are concordant in the exposure. Be- 
cause of their unique and attractive properties, twin 
studies will be given special attention in this paper. 

The aim of this paper is to give a detailed overview 
of different analysis methods for matched cohort 
studies with binary exposures and binary outcomes. 
In particular, our aim is to answer the following 
questions: (1) What are the target parameters in the 
common analysis methods? (2) What are the under- 
lying assumptions in these methods? (3) How do the 
methods compare in terms of statistical power? 

We illustrate the methods with two examples. The 
first example is a register-based study on the ef- 
fect of hysterectomy on the risk for cardiovascular 
disease (CVD) in Swedish women (Ingelsson et al., 
2010). The study is matched on birth year, year of 
hysterectomy and county of residence at year of hys- 
terectomy, so that for each hysterectomized woman 
(exposed), three nonhysterectomized women at same 
age and year were selected from the general popu- 
lation. The second study is a population-based twin 
study of the association between fetal growth and 
childhood asthma (Ortqvist et al., 2009). 

The paper is organized as follows. In Section 2 
we review the concepts of marginalization, condi- 
tioning and standardization. In Section 3 we define 
a matched cohort study. In Section 4 we describe 
the most common analysis methods for matched co- 
horts. These methods can also be used to analyze 
the exposure-discordant pairs in twin studies. In 
Section 5 we demonstrate how these methods can 
be adapted for inclusion of the exposure-concordant 
pairs in twin studies as well. In Section 6 we carry 
out a simulation study. In Section 7 we provide the 
two illustrating examples. We will restrict our atten- 
tion to 1:1 matching, and we will not consider ad- 
ditional covariate adjustments. Extensions to other 
matching schemes and adjustments for additional 
covariates are discussed in Section 8. 
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2. MARGINALIZATION, CONDITIONING 
AND STANDARDIZATION 

We first establish the notations and briefly review 
the concepts of marginalization, conditioning and 
standardization, which are crucial for the under- 
standing of matching and confounder adjustment. 
More thorough discussions can be found in standard 
epidemiological textbooks (e.g., Rothman et al., 
2008). Let X denote the binary exposure of interest 
(0/1), let Y denote the binary outcome of interest 
(0/1) and let Z denote a set of potential confounders 
for the association between X and Y. We use Pr(-) 
generically for both probabilities (population pro- 
portions) and densities, and we use E(-) for expected 
value (population average). We use V\ _L V2IV3 as 
shorthand for U V\ and V2 conditionally independent, 
given V3." We use (log) odds ratios to quantify the 
X—Y association. Other possible options would be 
risk differences or risk ratios. There are two reasons 
for focusing on odds ratios. First, regression mod- 
els for odds ratios can be conveniently fitted with- 
out restrictions; see Section 4.1.1. Second, in applied 
scenarios, it is often desirable to make results com- 
parable with case control studies, in which only odds 
ratios are estimable. 

An unadjusted analysis targets the marginal 
(over Z) association between X and Y, for exam- 
ple, through the marginal odds ratio 

_ Pr(Y = l\X = 1) Pr(Y = 0\X = 0) 
m ~ Pr(Y = 0\X = 1) Pr(Y = 1\X = 0) ' 

We define ip m = log (OR m ). In the presence of con- 
founders Z, OR m fails to have a causal interpreta- 
tion. In particular, it may differ from 1 in the ab- 
sence of a causal effect. 

The influence of Z can be eliminated by condi- 
tioning on Z, as in the conditional odds ratio 

= Pr(Y = l|A = l,Z)Pr(Y = 0|A = 0,Z) 
c{ ' p r (Y = 0|X = l,Z)Pr(Y = 1\X = 0,Z)' 

The conditional odds ratio OR c (Z) depends, in gen- 
eral, on Z. If Z is the only confounder for the X—Y 
association, then OR c {Z) can be interpreted as the 
conditional causal effect of X on Y, given Z, on the 
odds ratio scale. If there are additional confounders, 
then OR c (Z) has no causal interpretation. 

OR c (Z) is a subpopulation (i.e., Z-specific) effect. 
The effect for the whole population can be obtained 
through standardization. The standardized proba- 
bility of Y = 1 given X = x, is given by 

(1) E z {Pv(Y = l\X = x,Z)}, 



where we have used subindex Z to highlight that 
the expectation is taken over the marginal distri- 
bution Pr(Z). We emphasize that the expression 
in (1) is not, in general, equal to E z ^x= x {^ r 0^ = 
1\X = x,Z)\X = x} = Pr(Y = 1\X = x), which is 
the marginal (unadjusted) probability of Y = 1, given 
X = x. If Z is the only confounder, then Ez{Pt(Y = 
1\X = x,Z)} can be interpreted as the hypotheti- 
cal (counterfactual) probability of Y = 1 , had every- 
body attained level X = x in the source population 
(Hernan and Robins, 2006). Pr(Y = 1\X = x, Z) can 
be standardized to any proper distribution Pr*(Z), 
not necessarily equal to Pr(Z). We let E Z (V) denote 
the expected value of V, where the expectation is 
taken over Pr*(Z). If Z is the only confounder, then 
E* z {Pi(Y = 1\X = x,Z)} can be interpreted as the 
hypothetical (counterfactual) probability of Y = 1, 
had everybody attained level X = x in the fictitious 
population where Z follows the distribution Pr*(Z). 
A standardized odds ratio is constructed as 

OR s 

_ E{Pr(Y = l\X = 1, Z)}E{Pv(Y = 0\X = 0, Z)} 
~ E{Pv(Y = 0\X = 1, Z)}E{Pr(Y = 1\X = 0, Z)} ' 

We define ip s = log(OR s ). In (1), Pr(Y~ = 1\X = x, Z) 
is standardized to Pr(Z), that is, the distribution 
of Z in the source population. In order to keep the 
notation simple, we use OR s and ip a , even if Pr(Z) 
is replaced by Pr*(Z), and we let it be clear from 
the context which distribution of Z these parame- 
ters are standardized to. If Z is the only confounder, 
then OR s can be interpreted as the causal effect of X 
on Y in the source/fictitious population, on the odds 
ratio scale. We emphasize that although the numer- 
ical values of OR s and tp s may depend heavily on 
which distribution of Z they are standardized to, 
they are always, by construction, adjusted for Z. 

In general, there is no ordering in the magnitudes 
of OR c (Z), and OR s . An interesting special case 
occurs when OR c (Z) is constant across levels of Z, 
that is, 

(2) log{ OR c (Z)} = ^ c . 

It can be shown (Neuhaus et al., 1991) that \ip c \ > 

In general, there is no ordering in the magnitudes 
of OR m and OR c (Z), or of OR m and OR s ; con- 
founding by Z can both inflate or deflate the associa- 
tion between X and Y. There are a few special cases 
though. If Y ± Z\X, then Pr(Y = 1\X, Z) = Pi{Y = 
l\X) which implies that OR m = OR c (Z) = OR s for 
all Z and all standardization distributions Pr*(Z). 
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X > Y 

Fig. 1. A causal structure for which Y _L Z|X . 

Z 




X > Y 

Fig. 2. i4 causal structure for which X _L Z. 

This would happen if the true causal structure be- 
tween X, Y and Z is as in Figure 1. If X _L Z, then 
Vt(Z\X) = Pr(Z) which implies that OR m = OR s 
for the particular distribution Pr(Z), that is, the dis- 
tribution of Z in the source population. This would 
happen if the true causal structure is as in Figure 2. 

We note that in Figures 1 and 2, Z is not a con- 
founder, and OR m can be given a causal interpreta- 
tion. Thus, for these scenarios, adjusting for Z is not 
necessary for causal inference. We further note that 
the structure in Figure 2 does not render OR m equal 
to OR c (Z), even if OR c (Z) is constant across levels 
of Z. This is a consequence of the noncollapsibility 
of the odds ratio. For a more thorough discussion on 
(non)collapsibility and the special properties of odds 
ratios, we refer the reader to Greenland et al. (1999). 

3. MATCHED COHORT STUDIES 
3.1 Design 

A cohort study that is 1:1 matched on Z con- 
sists of n pairs of observations, each pair consisting 
of one exposed subject {X = 1) and one unexposed 
subject (X = 0). The pairs are constructed so that 
the two subjects within each pair have the same level 



of confounder Z\ that is, Z may vary between pairs, 
but not within pairs. Thus, Z is equally distributed 
among exposed and unexposed in the matched co- 
hort. The outcome Y is assumed to be recorded for 
each subject. Ignoring Z, the paired data can be 
conveniently represented as in Table 1. In practice, 
1:1 matched pairs are typically constructed by first 
drawing an exposed person from the whole popu- 
lation, then drawing an unexposed person with an 
equal or similar level of confounder Z; we refer to 
this sampling scheme as exposure-driven matching. 

We note that in twin studies Z is not directly 
observed, but should be interpreted as all the unob- 
served factors that are common within a twin pair. 

3.2 Likelihood Construction 

Before discussing the various analysis methods, 
we construct the likelihood for the observed data. 
Let Zi denote the common value of Z for pair i, 
i G {1,2, ...,n}. Let Y® and denote the out- 
come Y for the unexposed (X = 0) and the exposed 
(X = 1) subject in pair i, respectively. The matched 
data consists of n i.i.d. observations (Y®, Y^,Zi). We 
suppress the index i when not needed, so that Y x de- 
notes Y for the subject with X = x, x £ (0, 1), within 
an arbitrary pair. We use Pr(Y = y, X = x, Z = z) to 
denote the population probability of (Y = y, X = x, 
Z = z), and we will use Pr*(Y° = y® ,Y l = y 1 , Z = z) 
to denote the probability for (Y° = y° ' ,Y l =y 1 , Z = z) 
induced by the matched sampling scheme. Under 
exposure-driven matching, the design implies that 

(3a) Pr*(Y x =y x \Z)=Pr(Y = y x \X = x,Z) 

and 

= y°,Y 1 =y 1 \Z) 

(3b) 

= Pr*(Y° = y°\Z) Pr^F 1 = y l \Z). 

Equation (3a) "ties" the induced distribution to the 
source population distribution, thus allowing for sam- 
ples from the former to be used for inference on 



Table 1 

Crude summary of matched 1:1 cohort data 



Unexposed pair member (X = 0) Totals 
Event (Y = 1) No event (Y = 0) 

Exposed pair member (X = 1) 

Event (V = 1) T U T + U 

No event (Y = 0) V W V + W 

Totals T + V U + W n 
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the latter. Equation (3b) determines the correlation 
structure of the data, which is crucial for correct 
standard error computations. In twin studies, (3a) 
and (3b) do not necessarily hold (see Section 5), but 
are assumed throughout the paper. 

The induced marginal distribution of Z is deter- 
mined by the type of matching. Under exposure- 
driven matching, the induced marginal distribution 
of Z equals the source population distribution of Z 
among the exposed, that is, Pi*(Z) = Pi{Z\X = 1). 
In twin studies restricted to the exposure-discordant 
pairs, we have that Pr*(Z) = Pr(Z| discordant in X). 

When Z is observed (as in regular matched stud- 
ies), the likelihood contribution for pair % is 

Pr^Y^y?,^^) 
i 

= 1[Pt(Y = yf\X = x,Z i )Pv*(Z i ), 
x=0 

so that the likelihood for the whole data set becomes 
equal to 

n 1 

[] [] Pr(Y = yf\X = x, Zi) Vi*{Zi). 

i=lx=0 

When Z is unobserved (as in twin studies) , the like- 
lihood contribution for pair i is 

^{Pr*(YP = y?,Y/= y ^)} 

= EUl[Pr(Y = yf\X = x,Z i )\, 
U=o J 

so that the the likelihood for the whole data set 
becomes equal to 

f[E*Jl[Pr(Y = y?\X = x,zA. 
i=\ 1^=0 J 

We note that marginally (over Z), Y° and Y 1 are as- 
sociated through the common value of Z; the strong- 
er conditional association between Y and Z, giv- 
en X, the stronger marginal association between Y° 
and Y 1 . 

4. ANALYSIS METHODS 

In this section we describe and compare the most 
common analysis methods for matched cohorts. We 
emphasize that all these methods can in principle 
be used to analyze the exposure-discordant pairs in 
twin studies as well. However, the explicit regression 
model (Section 4.1) requires Z to be observed, which 
is typically not the case in twin studies. 



4.1 Regression Model Explicitly Involving Z 

A straightforward way to adjust for Z is to fit a re- 
gression model for Y , given X and Z, for example, 

(4) logit{Pr(Y = 1\X, Z; j> c , 7)} = b(Z; 7) + V> C X, 

where b(Z; 7) is an explicitly specified parametric 
function of Z, typically a linear function "f T Z for 
continuous Z. We refer to a regression model for Y, 
given X and Z, as "explicit." Under model (4), 
log{ OR c (Z)} = tpa so that the condition in (2) is 
met. This restriction is not crucial though; in prin- 
ciple we can add arbitrary interaction terms be- 
tween X and any of the components of Z. Maximum 
likelihood estimates (MLEs) of (^,7) are obtained 
by maximizing the conditional (given Z) likelihood 

n 

J[Pv"(Y<>=y'!,Y>=y}\Z i ) 

(5) " , 

n 1 

= n n pr ( y = yf\ x = x > ^ ^,7), 

1 = 12 = 

where the equality follows from (3a) and (3b). If (3b) 
is violated, then Y 1 and Y° are not conditionally 
independent, given Z, and the right-hand side of (5) 
is not a proper likelihood. However, if (3a) holds 
(and model (4) is correct), then each separate term 
Pr(Y = yf\X = x, Zi]ip c ,j) in (5) equals the true 
marginal (over Y l 1 ~ x ) likelihood Pi(Yf = yf\Zi). It 
follows that the obtained estimate of tp c is consistent 
under (3a), regardless of whether (3b) holds or not. 

4.1.1 Disadvantages 

(1) If Z is high dimensional, it may be difficult 
to well specify the function b(Z;j). 

(2) If Z is not directly observed, as in twin stud- 
ies, explicit specification of b(Z; r y) is not possible. 

(3) In principle, explicit regression models can be 
adapted for risk differences and risk ratios, by using 
identity links or the log links, respectively. However, 
absolute risks and logarithms thereof are, unlike log 
odds, restricted to ranges (0,1) and (0, 00), respec- 
tively. Thus, models utilizing identity links or log 
links have to be fitted under these restrictions, which 
can be rather inconvenient, or they may produce es- 
timates which are outside the supported ranges. 

4.2 Conditional Logistic Regression 

Conditional logistic regression mitigates the prob- 
lems with an explicit specification of b(Z; 7). In con- 
ditional logistic regression, the function b(Z; 7) in (4) 
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is replaced with a scalar pair-specific parameter b: 

(6) logit{Pv(Y = l\X,Z)} = b + ip c X. 

Nothing is assumed about b, and thus the risk for 
model misspecification in b(Z; n f) is avoided. A MLE 
of ip c is obtained by conditioning on Y® + YJ 1 , for 
each pair i, and maximizing the resulting condi- 
tional likelihood, which under (3a) and (3b) is given 
by 

p) n TT^- 

Since the conditional likelihood (7) does not involve b 
(or Z), it can be used, even if Z is not directly ob- 
served, as in twin studies. The MLE of ijj c obtained 
by maximizing (7) is given by 



(8) 



i> c . dr = log(U/V), 



with standard error s.e.{ip c , c i r } = \J\J~ X + V -1 . 
4.2.1 Disadvantages 

(1) The constant odds ratio assumption (2) is 
crucial in conditional logistic regression. If an inter- 
action term is included between b and X in model (6), 
then b cannot be eliminated by conditioning argu- 
ments. If (2) is violated, then i^c.cir converges to 
a weighted average of the Z-specific odds ratios; see 
Section 4.4. 

(2) ipc.dr is generally inconsistent if (3b) is vio- 
lated. There is an important exception. Define the 
null hypothesis 

(9) H : (2) holds, with ip c = 0. 

In Appendix B we show that ip c .cir converges to 
under Ho and (3a), regardless of whether (3b) holds 
or not. 

(3) Conditional logistic regression cannot be used 
for other measures of association than the log odds 
ratio, since for other links than the logit link, b can- 
not be eliminated by conditioning arguments. 

4.3 Mixed Model 

In the mixed model approach, b is assumed to 
be random, with a specified parametric distribution 
Pr*(b;0). The MLE of (V> c ,0) is obtained by maxi- 
mizing the marginal (over b) likelihood 



i=l 



(10) 



IK 



i=i 



lPr(Y = y?\X = x,b i ;i>c)\;l 



,x=0 



where the equality follows from (3a) and (3b), and 
the expectation on the right-hand side is taken over 
Pr*(6;0). Neuhaus et al. (1994) showed that the 
mixed model estimate of tp c is identical to ipc.dn un ~ 
der mild conditions. This implies that the two meth- 
ods are equally efficient, and that the mixed model 
is robust against misspecification of Pr* (&;#). 

4.3.1 Disadvantages 

(1) The constant odds ratio assumption (2) is 
crucial in the mixed model. Neuhaus et al. (1994) 
showed that the mixed model is saturated, under 
mild conditions, so that an interaction term be- 
tween b and X would lead to identifiability problems. 

(2) The mixed model estimate of ip c is generally 
inconsistent if (3b) is violated. 

(3) In principle, the mixed model can be adapted 
for risk differences and risk ratios, by using identity 
links or the log links, respectively. In practice, these 
adaptations require that the model is fitted under 
restrictions, or it may produce estimates outside the 
supported ranges. 

(4) Explicit maximization of the likelihood in (10) 
requires numerical techniques. This makes the meth- 
od less transparent and relatively computer-intensive. 

4.4 Exposure-Discordant Crude Analysis 

The methods described in Sections 4.1-4.3 all tar- 
get the conditional odds ratio, OR c {Z). Matched da- 
ta can also be used to estimate a standardized odds 
ratio. Let n yx denote the number of subjects in the 
sample with Y = y and X = x, so that noo = U + W, 
n i = V + W, m = V + T and n u = U + T. Un- 
der (3a) we have that Pr*(Y x = y x ) = E* z {Pr(Y = 
y x \X = x, Z)}, that is, Pr*(Y x = y x ) equals the prob- 
ability of Y = y x given X = x, standardized to Pr*(Z). 
Thus, under (3a) a consistent estimate of ip s is given 
by the crude log odds ratio 



s. crude 



log 



/ nun QQ \ 
V n oi^io/ 



The standard error of i^s. crude (see Appendix A) 
is given by 



12 \ n n +n Q1 +n 10 + n QQ - 2n . 

The first four terms under the square root sign can 
be recognized from the usual standard error formula 
for a log odds ratio, and the fifth term is an adjust- 
ment for non-i.i.d. observations. 

We remind the reader that the interpretation of ip s 
depends on what distribution of Z that ip s is stan- 
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dardized to. Under exposure-driven matching, 
Pr*(Z) = Pr(Z|X = 1) so that ip a is standardized to 
the distribution of Z among the exposed. In a twin 
study, Pr*(Z) = Pr(Z| discordant in X) so that tp s 
is standardized to the distribution of Z among the 
exposure-discordant pairs. 

4.4.1 Advantages One potential disadvantage of 
the exposure-discordant crude analysis is that it es- 
timates a parameter that is rather nonstandard. In 
the simple scenario that we consider (i.e., 1:1 match- 
ing and no additional covariate adjustments) the 
exposure-discordant crude analysis does not suffer 
from any of the other disadvantages listed in Sec- 
tions 4.1-4.3. The relative advantages of the expo- 
sure-discordant crude analysis are threefold: 

(1) The exposure-discordant crude analysis re- 
lies on fewer assumptions than the other methods. 
Specifically, it does not rely on assumptions (2) 
and (3b). 

(2) The exposure-discordant crude analysis is 
computationally simple. 

(3) In the exposure-discordant crude analysis, 
the standardized probabilities Pr*(Y° = 1) and 
Pr*(y x = 1) can be estimated separately, and can 
subsequently be used to construct any standardized 
measure of the X-Y association, for example, risk 
difference or risk ratio. For this reason, the exposure- 
discordant crude analysis easily extends to nonbina- 
ry outcomes as well. For survival outcomes, for in- 
stance, an exposure-discordant crude analysis can be 
used to produce standardized Kaplan-Meier curves. 

4.4.2 A closer comparison with conditional logis- 
tic regression Because ip s and tp c are different pa- 
rameters, it is not meaningful to compare the meth- 
ods in Sections 4.1-4.3 with the exposure-discordant 
crude analysis in terms of efficiency of estimates. 
However, we can make a meaningful comparison in 
terms of statistical power. Define the null hypothesis 



(13) 



H$ : ij s = 0. 



It is easy to show that Ho in (9) implies Hq, re- 
gardless of whether (3a) and (3b) hold or not. If 
both (3a) and (3b) hold, then a Wald test of Ho is 
based on the statistic T c = tp c .dr/s-e.(tjj c , c i r ). If (3a) 
holds, then a Wald test of Hq is based on the statistic 

.crude/ s-^-i^ s. crude)- I n Appendix B we show 
that T c and T s are asymptotically equal. It immedi- 
ately follows that the two Wald tests have the same 
asymptotic power, for any fixed alternative. 



One potential argument against the exposure- 
discordant crude analysis is that it does not inform 
us about the exposure effect in the source popula- 
tion. Under exposure-driven matching (and no con- 
founders apart from Z), ip s is a causal effect in a fic- 
titious population where Z is distributed as among 
the exposed. In a twin study restricted to the expo- 
sure-discordant pairs (and no confounders apart 
from Z), ifj s is a causal effect in a fictitious popula- 
tion where Z is distributed as among the exposure- 
discordant pairs. The effect in these fictitious popu- 
lations may differ from the effect in the source pop- 
ulation, and it is not always obvious whether these 
fictitious population effects are relevant targets for 
inference. However, a closer examination shows that 
a similar argument can be used against the meth- 
ods that target ip c as well, and in particular against 
conditional logistic regression. Conditional logistic 
regression relies on the constant odds ratio assump- 
tion (2). This is a very strong assumption, which in 
any real scenario is most likely violated, to some ex- 
tent. Regardless of whether (2) holds or not, VVefr- 
converges to 



lo f 



Pr*^ 1 = 1,Y°=0) 



= log 



Pr*(Y° = 1,Y X = 



0) 



1,Y° = Q\Z)} 



£*{Pr*(Y° = 1,Y 1 =0\Z)} 



(3a), (36) 
= log 



E* z {Pr(Y = 1\X = 1,Z) Pr(Y = Q\X = 0,Z)} 
E* z {Pr(Y = 1\X = 0,Z) Pr(F = 0\X = 1, Z)} 



= log[E z {W(Z)OR c (Z)}], 
(14) 

where 

W(Z) 

Pr(Y~ = 1\X = 0, Z) Pr(Y = Q\X = 1,Z) 
~ E* z {Pr(Y = 1\X = 0, Z) Pr(y = fjjx = 1, Z)} ' 

In (14), the average is taken over Pr*(Z), that is, 
the same distribution of Z as being standardized 
to in the exposure-discordant crude analysis. Thus, 
if (2) is violated, then conditional logistic regres- 
sion does not inform the analyst about exposure ef- 
fects outside the fictitious population characterized 
by Pr*(Z), to any wider extent than the exposure- 
discordant crude analysis. Furthermore, whereas i/j s 
has a clear interpretation as a population causal ef- 
fect (when there are no confounders except Z), the 
weighted average in (14) does not have any such 
simple interpretation. 
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An analyst is always at the liberty to assume a pri- 
ori that (2) holds. But equally well, the analyst may 
assume that the effect in the fictitious population, 
characterized by Pr*(Z), is equal to the effect in the 
source population, characterized by Pr(Z). Neither 
of these assumptions is stronger than the other, since 
neither of them implies the other. Furthermore, with 
paired data and Z being unobserved (as in twin 
studies), these assumptions are both untestable. 

Although our focus is on cohort studies, we end 
this section by making a comparison with case con- 
trol studies. A matched case control study is de- 
signed analogously to a matched cohort study, but 
the roles of exposure and outcome are "switched" in 
the sampling scheme; see Section 1. Thus, in a match- 
ed case control study the crude sample log odds ra- 
tio consistently estimates the standardized log odds 
ratio 

\ E z {Vt{X = \\Y = 1, Z)}E* z {Pr(X = Q\Y = 0, Z)\ 
° g [E* {Pr(X = 0\Y = 1, Z)}E* Z {Pr(A = 1\Y = 0, Z)}\ ' 

(15) 

where Pr*(Z) = Pr(Z|Y =1). In contrast to condi- 
tional odds ratios, standardized odds ratios are not 
symmetrical. That is, the log odds ratio in (15), in 
which X appears to the left of the conditioning sign, 
cannot be written as tp s , in which X appears to the 
right of the conditioning sign. Hence, the log odds 
ratio in (15) has no simple interpretation as a causal 
effect of X on Y on the log odds ratio scale, even if 
there are no confounders apart from Z. 

5. ANALYSIS OF TWIN DATA 

In contrast to a regular 1:1 matched cohort study, 
a twin cohort also contains pairs that are concordant 
in the exposure. In this section we describe three 
common methods to incorporate the exposure-con- 
cordant pairs in the analysis. 

To deal with twin studies we extend the notation 
slightly. Let X^j and Yij denote X and Y for twin j 
in pair j, js (1,2). We suppress the index i when not 
needed, so that Xj and Yj denote X and Y for twin 
j, j £ (1,2), within an arbitrary pair i. As before, Z{ 
represents all the unobserved factors that are com- 
mon within a twin pair. As discussed in Section 1, 
the exposure-discordant pairs in a twin cohort can 
be viewed as a 1:1 matched cohort. However, some 
care must be taken. All methods discussed in Sec- 
tion 4 rely on assumption (3a), and conditional logis- 
tic regression (Section 4.2) and mixed models (Sec- 



tion 4.3) rely in addition on assumption (3b). For 
an exposure-discordant twin pair we have that 

Pr*(y° = y°,y 1 = y 1 |Z) 

(16) 

= Pr(Y} = y°, Y f = y x \Xj = 0, A> = 1, Z). 

The right-hand side of (16) can be factorized into 
Pr(Y j = y°\Xj = 0, Z) PT(Yj> = y l \X, y = 1, Z) if 

(17a) Yj± X f \(Xj,Z) 

and 

(17b) Y 1 ±Y 2 \(X 1 ,X 2 ,Z). 

Thus, the analogs to (3a) and (3b) for twin data are 
given by (17a) and (17b), respectively. Under (17a), 
(3a) holds, so that the explicit model (Section 4.1) 
and the exposure-discordant crude analysis (4.4) 
are valid when applied to the exposure-discordant 
pairs. We note though that it is typically not pos- 
sible to fit an explicit model to twin data, since Z 
is typically unobserved. If, in addition, (17b) holds, 
then (3b) holds as well, and all methods in Section 4 
are valid when applied to the exposure-discordant 
pairs. 

Potentially, (17a) could be violated if Xj> has 
a causal effect on Yj, that is, if the exposure for one 
twin affects the outcome for the other twin. Simi- 
larly, (17b) could be violated if Yji has a causal ef- 
fect on Yj, that is, if the outcome of one twin affects 
the outcome for the other twin. 

5.1 All-Pair Crude Analysis 

Let r yx denote the number of subjects in the full 
(i.e., both exposure-concordant and exposure-dis- 
cordant pairs) sample with Y = y and X = x. One 
simple way to make use of all twin pairs in the anal- 
ysis is to compute the crude sample log odds ratio 

(18) ^m.crude = log ( Jj-gg ) , 

which consistently estimates the marginal log odds 
ratio tfj m . Thus, unlike the exposure-discordant crude 
analysis (Section 4.4), the all-pair crude analysis 
does not adjust for confounding by Z. The standard 
error of ip m . C rude is rather complicated, due to the 
paired nature of the data. In Appendix A we pro- 
vide an analytic expression for the standard error. 
We note that the standard error can also be com- 
puted numerically, through Generalized Estimating 
Equation (GEE) procedures, which are implemented 
in most common statistical softwares. 
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5.2 Decomposition into Within- and 
Between- Effects 

In twin studies with continuous exposures and 
outcomes, a popular regression model is 



with 



(19) 



EiYjlX^X?) = /3 + p w (Xj -X) + f3 B X 

= A) + PwXj + p B x, 

with X = ^i±^a and /3' B = fa - /3 W (Carlin et al., 
2005). In (19), the pair-specific mean X is thought of 
as conveying information about the confounders Z, 
which are not observed, but constant within each 
pair. Thus, the parameter /?b is thought of as quan- 
tifying the strength of confounding, a "between ef- 
fect," and the parameter /3\y is thought of as quan- 
tifying the adjusted X—Y association, a "within ef- 
fect." When X and Y are binary, a natural analog 
to (19) is 

logit{Pr(Y i = ll^,^,)} 

= Po + PwXj + P B X. 

To see the connection with the methods described 
in this paper, note that 

/3 W = logit{Pr(y j = l\Xj = l,X f = 0)} 

- logit{Pr(y j = l\Xj = 0,X f = 1)} 

= logit[ J B{Pr(y i = l\Xj = l,X r = 0,Z)\ 



(20) 



X, 



l,X f 



- logit[£{Pr(Yj- = l\Xj = 0, Xj 



0}] 

■■1,Z) 



X, 



0,X f = l}} 



= logit[E*{Pv(Y j = l\X J = l,Z)}} 
- logit[£;*{Pr(y i = l\Xj = 0, Z)}] 

where Pr*(Z) = Px(Z\X x / X 2 ), and the third equal- 
ity follows from assumption (17a). Thus, the within- 
effect f3w is identical to the log odds ratio standard- 
ized to the distribution of Z among the exposure- 
discordant pairs. This argument shows that the de- 
composition into within- and between-effects is a le- 
gitimate method for binary exposures, which was 
questioned by Carlin et al. (2005). 

When X is binary, X can only take values 0, 0.5 
and 1. Thus, it is feasible to replace the linear term 
A) + ft-gX in (20) with one parameter for each level 
of X, that is, 

(21) logit{Pr(y j = l\Xj,X f )} = P w Xj + m(X), 



(22) 



m(X) = /3 l(X = 0) 



+ /3 . 5 l(X = 0.5)+/3il(X = l). 



It is easy to show that the model in (21) is satu- 
rated (i.e., imposes no restrictions on Pr(Yj\Xi,X2), 
which implies that the MLE of /3\y based on (21) is 
identical to the crude sample log odds ratio in (11). 

5.3 Mixed Model 

The model in (6) can be fitted to all pairs, assum- 
ing a parametric distribution of b indexed with 9. 
Parameter estimates are obtained by maximizing 
the marginal (over b) likelihood 



Zi \Xi\,Xi 



(23) 



1 1 ^bi\Xii,X i: 



,{Pr(Ya = yix,Y i2 = yi2\Xa,Xi 2 ,Zi)\ 

Xa,Xi2} 
Pr(Yy = yij\Xij , bf, ip c ) > 

j=l J 

Xn , Xi2 ; 9 



This approach, however, is associated with a se- 
vere problem which is often overlooked. Typically, 
the distribution of b is specified to not depend on 
(X±,X2), for example, a normal distribution with 
fixed but unspecified mean and variance. However, 
from the expression in (23) it is clear that this proce- 
dure only produces a proper likelihood under the ad- 
ditional assumption that b _L (Xi,X 2 ). In standard 
textbooks, this assumption is often stated without 
justification or interpretation (e.g., Fitzmaurice et 
al., 2004, page 329). Since b is supposed to repre- 
sent the potential confounders Z, we would not gen- 
erally expect that b _L (Xi,X 2 ). Indeed, if Z (and 
thus b) is independent of (Xi,X 2 ), it cannot be 
a confounder, and there is no need to adjust for Z in 
the first place. We note that in matched cohort stud- 
ies, (Xi,X2) is constant and equal to (0,1) for all 
pairs, so that an association between b and (X\, X 2 ) 
is ruled out by design. When b is associated with 
{X\,X2), the aforementioned procedure can yield 
severely biased estimates (Neuhaus and Kalbfleisch, 
1998; Neuhaus and McCulloch, 2006). In general, the 
proper marginal likelihood is obtained by averaging 
over a specified distribution Y > r{b\X\,X2) for each 
pair. This procedure can be very computer intensive, 
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and cannot be carried out with standard software. 
As noted by Neuhaus and Kalbneisch (1998) and 
Neuhaus and McCulloch (2006), there is a simple so- 
lution to this problem. Suppose that given (Ai, A2), 
b has a normal distribution where the mean, but not 
the variance, depends on (Ai, A 2 ). Without loss of 
generality, we can formulate this as 

(24) b = d + m(X), 

where m(A) is defined in (22) and d|Ai,A 2 ~ 
A(0,cr 2 ). Under (24), model (6) translates to 

(25) logit{Pr(y i = l\Xj,Z)} = d + ip c Xj + m(X), 

where d _L (Ai,A 2 ) by construction. The model 
in (25) can be fitted with standard mixed model 
software. By comparing the model in (25) with the 
model in (21), we see that the solution proposed by 
Neuhaus and Kalbneisch (1998) and Neuhaus and 
McCulloch (2006) can be thought of as combining a 
mixed model with a within-between decomposition. 

Neuhaus and Kalbneisch (1998) and Neuhaus and 
McCulloch (2006) observed that for various scenar- 
ios, the estimate of ip c obtained by combining a mixed 
model with a within-between decomposition is nearly 
identical to ip c .clr- Neuhaus and McCulloch (2006) 
gave a theoretical motivation for this observation. 
We note that there are situations when the two es- 
timates may differ; see Brumback et al. (2010) for 
an example. 

6. SIMULATIONS 

6.1 Part I: Efficiency and Power 

In this section we compare the performance of the 
methods described in Sections 4 and 5, in terms of 
efficiency and power. To enable a fair comparison, 
we analyze the simulated data so that all assump- 
tions hold, for each method respectively. In these 
simulations, twin pairs were generated. We empha- 
size that this simulation scheme covers matched data 
as well, since the exposure-discordant twin pairs can 
be viewed as a matched cohort. For each twin pair, 
the random variables (Ai, X2, b, Yi, Y2) were gener- 
ated from the model 

Prpfi = 1\X 2 = 0) = PrfXg = l\Xi = 0) 

Pr(Xi = 0\X 2 = 0) Pr(X 2 = 0\X X = 0) 
1 



(26) 



= p= r 

Pr(Xi = 1, X 2 = 1) Pi(X 1 =0,X 2 = 0) 
Pr(Xi = l,X 2 = 0) Pr(X! = 1, X 2 = 0) 
b\X 1 ,X 2 ~N{6X,l}, 
Y 1 ±Y 2 \(X 1 ,X 2 ,b), 
Y^X^X^b), 
[logit{PT(Y j \X j ,b)} = b + ^ c X j . 



We highlight a few aspects of the model in (26): 

(1) Under model (26), assumptions (2), (17a), 
(17b) and (24) all hold. 

(2) The restriction Pr(X x = 1\X 2 = 0) = Pr(A 2 = 
l\Xi = 0) in the first row of (26) follows by symme- 
try. 

(3) It may appear natural to first specify a mar- 
ginal distribution of 6, then specify a conditional dis- 
tribution of (Ai, A 2 ), given b. The reason for doing 
it the other way around is twofold. First, it allows us 
to directly control the rate of exposure-discordance 
through (j). Second, it allows us to easily formulate 
the distribution of b given {X^X^) in such a way 
that (24) holds. 

(4) It follows from results in Chen (2007) that 
the joint distribution of (Ai, A 2 ) is completely de- 
fined by p and 4>. It also follows that p and <fi are 
variation independent (i.e., the value of p does not 
restrict the value of <£>, and vice versa). 

(5) The values of 4> and 9 determine the degree 
of conditional association of X\ and X2, given b. 
It can be shown (see Appendix C) that for 9 = 
2ylog(0), Ai _L A 2 |6. For convenience, we have used 
9 = 2y / log(</>) throughout. We note though that none 
of the methods presented relies on this restriction. 

In the first set of simulations, we used <j) = 4 and 
tp c = 0, that is, the data were generated under Ho 
in (9). For these values, ip s = and tp m = 1.28, which 
implies a severe degree of confounding. Further, 
Pr(Ai / A 2 ) = 0.33, and Pr(X x / X 2 ,Y 1 / Y 2 ) = 
0.11. We generated 5000 samples, each of size n = 
2000. Each sample was analyzed with 6 different 
methods: 

(1) Explicit regression model logit{Pr(Y = 1| 
A, b)} = 70 + Jib + tp c X (Section 4.1). We remind 
the reader that for twin data, b (or rather, Z) is 
typically unobserved, which rules out the use of an 
explicit model. For a regular matched cohort, the 
explicit model is a viable choice. Thus, the model 
was only fitted to the exposure-discordant pairs. 

(2) Conditional logistic regression (Section 4.2). 

(3) Mixed model fitted to the exposure-discordant 
pairs (Section 4.3). We used the model Pr(Y = 1| 
X,b) = b + ifj c X, with 6|Ai/A 2 ~A(#,cr 2 ). 

(4) Exposure-discordant crude analysis (Sec- 
tion 4.4). 

(5) All pair crude analysis (Section 5.1). 

(6) Mixed model fitted to all pairs (Section 5.3). 
We used the model Pr(Y = 1| A, b) = b + ip c X, with 
b\X u X 2 ~ N{9X,a 2 ). 
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Table 2 

Simulation results for ip c = 0, <f> = 4 



Analysis method Target parameter Mean est Emp s.e. Th s.e. 



1. 


Explicit 




= 


0.00 


0.13 


0.13 


2. 


Cond log reg 




= 


0.00 


0.13 


0.13 


3. 


Mixed discordant 




= 


0.00 


0.13 


0.13 


4. 


Crude discordant 


4> s 


= 


0.00 


0.11 


0.11 


5. 


Crude all 


4>m = 


-- 1.28 


1.28 


0.08 


0.08 


6. 


Mixed all 


"4>c 


= 


0.00 


0.12 


0.12 



Table 2 displays the mean (over samples) point es- 
timate, the empirical standard error and the mean 
theoretical standard error for each analysis, respec- 
tively. We note that all methods yield virtually un- 
biased estimates of their target parameters. For all 
methods the mean theoretical standard error is iden- 
tical to the empirical standard error, to the second 
decimal. 

To compare the methods in terms of their power 
to reject Ho, we carried out a second set of simula- 
tions. We used <f> = 4 and varied ip c over the range 
(0,0.6). For each value of ip c , we drew 5000 sam- 
ples of 2000 pairs each. Each sample was analyzed 
using methods 1, 2, 3, 4, 6. Figure 3 displays the 



empirical rejection probability (i.e., the power) for 
a Wald test at 5% significance level, for each method 
as a function of ip c . We observe that the all meth- 
ods have almost identical power, for the simulated 
scenarios. 

In a third set of simulations, we used ip c = 0.4 and 
varied 4> over the range (4,22). These values cor- 
respond to the range (0.33,0.13) for Pr(Xi ^X 2 ), 
and the range (0.11,0.03) for Pr(Xi / X 2 , Yy + Y 2 ). 
For each value of 4>, we drew 5000 samples of 2000 
pairs each. Each sample was analyzed using meth- 
ods 1, 2, 3, 4, 6. Figure 4 displays the power for each 
method as a function of <p. Again, we observe that 
there is almost no difference between the methods, 
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4> 

Fig. 4. Simulation results for ip c = 0.4, <f> € (4, 22) . 



in terms of power, even when the discordance rate 
is very low. 

Some care must be taken when interpreting power 
curves. In small samples, parameter estimates can 
be biased, which may lead to an increased probabil- 
ity of rejection, both under the alternative hypothe- 
sis and under the null hypothesis. Thus, an increased 
power under the alternative hypothesis may come 
at the cost of a violated significance level under the 
null hypothesis. Figure 3 shows that the nominal 
significance level (= 5% at tp c = 0) is preserved for 
all methods when = 4. To confirm that the nom- 
inal significance level is preserved across the range 
(f> G (4, 22), which generated the power curves in Fig- 
ure 4, we carried out a fourth set of simulations, 
using ip c = and varying eft over the range (4,22). 
For each value of </>, we drew 5000 samples of 2000 
pairs each. Each sample was analyzed using methods 
1, 2, 3, 4, 6. Figure 5 displays the rejection proba- 
bility for each method as a function of </>. We ob- 
serve that the rejection probability is close to 0.05, 
for all methods and all values of eft in the simulated 
range. 

Table 2 and Figure 3 indicate that methods 1-4, 
and 6 are unbiased under the null hypothesis. Addi- 



tional simultions have confirmed that the methods 
are unbiased under various alternative hypotheses 
as well (data not shown). 

6.2 Part II: Sensitivity to Underlying 
Assumptions 

In this section we demonstrate through examples 
that the explicit model, conditional logistic regres- 
sion and the mixed model, can yield biased esti- 
mates, if their underlying assumptions are violated. 

We first consider the assumption that b _L (Xi, X2), 
which is often made for mixed models; see Section 5.3. 
Toward this end we reanalyzed the 5000 simulated 
samples which generated Table 2, now fitting the 
mixed model Pr(y = 1\X, b) = b + ip c X to all pairs, 
with b\X 1 ,X 2 ~ N(0,a 2 ). We obtained a mean esti- 
mate of tp c equal to 1.32, which is indeed biased as an 
estimate of the true value ip c = 0. We note that this 
mean estimate is very close to the ip m . cr ude (= 1-28) 
in Table 2. This further demonstrates that ignoring 
the association between b and (X^, X2) produces an 
estimate which is not adjusted for Z. 

Next, we consider the independence assumption 
(3b)/(17b), which is a prerequisite for conditional 
logistic regression and mixed models. Toward this 



MATCHED COHORT STUDIES 



13 



o 
d 



a> 
3 
o 

CL 



o 



C\J 

o 
d 



1 .Explicit 

2. Cond log reg 

3. Mixed discordant 

4. Crude discordant 
6.Mixed all 




10 



15 



"T" 
20 



Fig. 5. Simulation results for tp c = 0, £ (4, 22) . 



end we consider a simple scenario for which 
(27) (Y 1 ,Y 2 )±Z\(X 1 ,X 2 ), 



so that ip c = ip s 
( Pr ( Y j = 



ipm', see Section 2. We define 



(28) 



l\Y f 



0, Xj — 1, Xj 



l\Xi = 1,X. 



Pr(y i = l|^/ = 0,X,=0,X 
■Pr(Y J= 0,Y f 

/(p r (y, = o,y, 



o\x 4 



0) 

1) 

0) 



-p. 

- q, 



Pr(y i = i,y,, = o|^ 



l|Xi = 1, Xj/ 
l,X, 



0) 
= 0) 
= 0)) 



c. 



It follows from results in Chen (2007) that the joint 



distribution of Yj and Yy among the exposure-dis- 
cordant pairs, Pr(Yj,Yj/\Xj = l,Xj> =0), is com- 



pletely defined by the variation independent param- 
eters p, q and c. c quantifies the degree of deviation 
from (17b); in particular, (17b) is violated when 
1. It is easy to show that assumption (17a) is 
logically compatible with all joint values of (p,q,c). 
Thus, we proceed by assuming that (17a) holds, so 
that the exposure-discordant crude analysis consis- 
tently estimates ip s = ip c . Combining (27) and (28), 
and using results in Chen (2007), gives that i/j c . c i r 
converges to 



(29) 



log 



p(i - <i) 

q(l-p) 



whereas the true value of ip c {= = V'm) is given by 
P( l ~q)\ , ,__f 1-q + qc' 



(30) log 



log 



](1 -p) J ° [ 1 - p + pc_ 

Thus, the true value of ip c depends on the associ- 
ation between Y\ and Y 2 through the second term 
in (30), whereas the asymptotic limit of ip c . c i r does 
not. We used p = 0.3, q = 0.1, and c = 4. For these 
values, ipc = 0.97, whereas the asymptotic limit of 
ipc.dr equals 1.35, for conditional logistic regression. 
We generated 5000 samples, each consisting of n = 
2000 exposure-discordant twin pairs. For each pair, 
the random variables (Yi,!^) were generated from 
the model in (28). Each sample was analyzed with 
conditional logistic regression (method 2), the mixed 
model (method 3) and the exposure-discordant crude 
analysis (method 4). For these methods, we obtained 
an average estimate of tp c equal to 1.35, 1.26 and 
0.97, respectively. Thus, both conditional logistic re- 
gression and the mixed model produced biased esti- 
mates, whereas the exposure-discordant crude anal- 
ysis estimate was unbiased. 

Next, we consider misspecification of the func- 
tion b(Z;-f), in the explicit model. We generated 
5000 samples, each consisting of n = 2000 twin pairs. 
For each twin pair, the random variables (Z, X\,X2, 
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Yi,Y 2 ) were generated from the model 
Z = (V,W), 

V ±w, 

V~N(0,1), 
W~Ber(0.5), 

x x ±x 2 \z, 

(31) <J logit{Pr(X i = l|Z)} 

= a + aiF + + a-sVW, 
Y 1 ±Y 2 \(X 1 ,X 2 ,b), 

logit{Pr(y i = l\Xj,Z)} = b(Z; 7) + i/j c Xj, 
I 6(Z; 7) = 70 + mV + I2W + 73 VW, 

with «o = 2, «i = «2 = 1, «3 = —1.5, 70 = —2, 71 = 
72 = —1, 73 = 1.5, ip c = 1.3. Each sample was ana- 
lyzed with the misspecified explicit model 
logit{Pr(y i = l\Xj, Z)} = 70 + 7l V + l2 W + ^ c Xj. 
We obtained an average estimate of ij) c equal to 0.69, 
which is sever ly biased. 

Finally, we consider the assumption that the ran- 
dom effect b(Z;-f) is normally distributed, which 
is commonly made for mixed models. Toward this 
end we reanalyzed the 5000 samples generated from 
model (31), now fitting the mixed model Pr(Y = 
1\X, b) = b + ip c X to the exposure-discordant pairs, 
with b\X\ 7^ X 2 ~ N(6,a 2 ). Under the data gener- 
ating model, the conditional distribution of b(Z;j), 
given X\ 7^ X 2 is rather complicated, and, in par- 
ticular, not normal. We obtained an average esti- 
mate of ip c equal to 1.30, which is identical to the 
true value, to the second decimal. This finding sup- 
ports the theoretical results in Neuhaus et al. (1994), 
which state that the mixed model is robust against 
the normal random effect assumption. 

7. REAL DATA EXAMPLES 

7.1 Matched Cohort Data 

The first example is taken from a matched cohort 
study that aimed to investigate the effect of hys- 
terectomy on risk for CVD (Ingelsson et al., 2010). 
A common surgery among perimenopausal women, 
hysterectomy is often performed on benign indica- 
tions, but its long-term consequences are not fully 
understood. The study is based on the Swedish In- 
patient Register, where all women who underwent 
hysterectomy between January 1973 and December 
2003 (227,389 individuals) were identified. For each 
hysterectomized woman, three women who never had 
hysterectomy were randomly selected from the Reg- 
ister of Total Population. The three unexposed wom- 
en were individually matched to the exposed woman 



by birth year, year of hysterectomy, and county of 
residence at year of hysterectomy. 

Information on CVD status was obtained from 
the Inpatient Register and information of follow up 
through record linkage to the Cause of Death Reg- 
ister, Emigration Register and Cancer Register. To 
avoid bias from CVD events occurring in relation 
to the hysterectomy surgery, the exposed women 
started their risk time from 30 days after hysterec- 
tomy; they were then followed until CVD, heart fail- 
ure, cervical, corpus or ovarian cancer, death, emi- 
gration or end of study (Dec 31, 2003). Similarly, un- 
exposed women started their risk time 30 days after 
the date of matching, that is, the date of hysterec- 
tomy of the corresponding exposed woman. For fur- 
ther details on the study, see Ingelsson et al. (2010). 

In the current analysis we focus on 1:1 matched 
studies with binary outcomes. We constructed a bi- 
nary outcome by defining Y = 1 for women who de- 
veloped CVD during follow-up, and Y = for the 
remaining women. We constructed a 1:1 matched 
sample by matching each exposed woman to one un- 
exposed woman, which was randomly selected from 
the three unexposed women in the same set. Af- 
ter the exclusions described above, we ended up 
with 52,814 1:1 matched pairs, of which 6712 were 
discordant in both the exposure and the outcome. 
The data were analyzed with methods 1-4 described 
in Section 6. For method 1 we used the explicit 
model logit{Pr(Y" = 1|Z,AT)} = 70 + 71 [birth year] + 
72 [year at hysterectomy] + 73 [county] + i/j c X, 
where 73 is a factor parameter with one level for 
each county. 

Table 3 displays the results. For all three methods, 
there is a significant (at 5% level) association be- 
tween hysterectomy and CVD. The point estimates 
obtained by conditional logistic regression and expo- 
sure-discordant crude analysis are almost identical, 
whereas the point estimate obtained from the mixed 
model is twice as large. According to theory (Neuhaus 
et al., 1994) we would expect the mixed model esti- 
mate to be identical to the estimate obtained from 

Table 3 

Analysis results for the 1:1 matched subset of the 
hysterectomy- CVD data 



Analysis method 


Target parameter 


Point est 


95% CI 


1. Explicit 


Ipc 


0.03 


-0.02, 0.08 


2. Cond log reg 


Ipc 


0.03 


-0.02, 0.08 


3. Mixed discordant 


Ipc 


0.03 


-0.02, 0.08 


4. Crude discordant 




0.03 


-0.02, 0.07 
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Table 4 

Analysis results for the full 1:3 matched 
hysterectomy-CVD data 



Analysis method 


Target parameter 


Point est 


95% CI 


1. Explicit 


Ipc 


0.06 


0.02, 0.09 


2. Cond log reg 


Ipc 


0.06 


0.02, 0.09 


3. Mixed discordant 


Ipc 


0.06 


0.02, 0.09 


4. Crude discordant 


Ips 


0.05 


0.02, 0.09 



conditional logistic regression. Indeed, methods 1— 
4 all give identical estimates to the second deci- 
mal. 

Although our focus is on 1:1 matching, all meth- 
ods in this paper generalize directly to m:n match- 
ing (see Section 8). Table 4 displays the results when 
the whole 1:3 matched data is analyzed, using meth- 
ods 1-4 described in Section 6. 

7.2 Twin Data 

The second example is from a twin study of the as- 
sociation between fetal growth and asthma (Ortqvist 
et al., 2009). Several studies have shown that there is 
an association between asthma and low birth weight. 
This association could potentially be explained by 
a causal effect of impaired fetal growth on asthma, 
but may also be explained by confounding factors. 
In particular, gestational age is correlated with both 
birth weight and asthma, and may confound the 
birth weight-asthma association (Ortqvist et al., 
2009). Twins provide an excellent opportunity to 
separate the causal effect of birth weight from the 
confounding effect of gestational age, and at the 
same time adjust for other shared familial factors. 

All twins born in Sweden in June 1992 to June 
1998 were identified through the Swedish Twin Reg- 
ister at the age of 9 or 12 years. Information on 
asthma and zygosity was collected in telephone in- 
terviews with their parents. Birth weight was re- 
trieved from the Medical Birth Register (MFR). Of 
the 15,808 eligible twins 69% (10,918 individuals) 
had information on asthma and could also be se- 
curely linked to the MFR. In total, there were 3107 
MZ pairs. 1087 pairs were discordant in birth weight 
(exposure), where discordance was defined as a dif- 
ference greater than 400 grams or 15%, and 175 pairs 
were discordant on both birth weight and asthma 
(outcome). 

The data were analyzed using methods 2-6 de- 
scribed in Section 6. Table 5 displays the results. 
The estimates obtained from conditional logistic re- 



Table 5 

Analysis results for the birth weight-asthma twin data 



Analysis method 


Target parameter 


Point est 


95% CI 


2. Cond log reg 


tpc 


0.29 


-0.01, 0.59 


3. Mixed discordant 


Ipc 


0.29 


-0.01, 0.59 


4. Crude discordant 




0.18 


-0.01, 0.37 


5. Crude all 




0.33 


0.16, 0.50 


6. Mixed all 


Ipc 


0.30 


0.00, 0.60 



gression and the exposure-discordant crude analysis 
are both smaller than estimate obtained from the 
all-pair crude analysis. This finding suggests that 
the birth weight-asthma association is inflated by 
shared confounding. Methods 2, 3 and 6 gave very 
similar results, as predicted by theory (Neuhaus et al., 
1994; Neuhaus and Kalbfleisch, 1998). 

8. DISCUSSION 

We have given an overview of the most common 
analysis methods for matched cohort studies. We 
have identified the target parameters in each method, 
outlined the underlying assumptions and compared 
the methods in terms of statistical power. The anal- 
ysis methods that we have considered do not esti- 
mate the same parameter; the exposure-discordant 
crude analysis and the within-between model es- 
timate a standardized odds ratio, whereas the ex- 
plicit method, conditional logistic regression, and 
the mixed model estimate a conditional odds ra- 
tio. Thus, the choice between these methods should 
primarily be guided by the research question being 
asked. In addition, it is also important to consider 
the statistical power, underlying assumptions, com- 
puter intensity and flexibility of the methods. The- 
oretical arguments suggest that when all underlying 
assumptions hold, all methods that we have consid- 
ered have the same statistical power. This was con- 
firmed in our simulation study. In terms of under- 
lying assumptions, the methods differ significantly. 
The exposure-discordant crude analysis relies on few- 
er assumptions than the other methods. In terms of 
computer intensity, the mixed model requires nu- 
merical optimization, and is far more time consum- 
ing than the other methods. In terms of flexibility, 
all methods, except the exposure-discordant crude 
analysis, most naturally target odds ratios. The expo- 
sure-discordant crude analysis however, can easily 
be used to target any measure of the exposure-out- 
come association. 
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We have considered 1:1 matching. Frequently, m:n 
matching is employed, that is, each set is constructed 
by matching m exposed subjects to n unexposed 
subjects. All methods in this paper generalize di- 
rectly to m:n matching. Specifically, the underlying 
assumptions and the interpretation of the target pa- 
rameters remains the same under m:n matching. We 
conjecture that many of the theoretical properties 
that we have derived for 1:1 matching carry over to 
m:n matching as well, for example, the asymptotic 
equivalence in terms of power. However, a strin- 
gent treatment of m:n matching is more difficult. 
For instance, under violation of (2) the probabil- 
ity limit of ipc.cir has no longer an analytic expres- 
sion, which hampers a theoretical comparison with 
the exposure-discordant crude analysis. Comparing 
the methods under m:n is a topic for future re- 
search. 

In practice, it is often desirable to adjust the anal- 
ysis for additional covariates which are not matched 
on. In the model-based methods (i.e., all methods 
except the exposure-discordant crude analysis), ad- 
justment for additional covariates can easily be ac- 
complished by adding the covariates as a regressor 
in the model. It is not obvious though, how to adjust 
for additional covariates in the exposure-discordant 
crude analysis. Extensions of the exposure-discordant 
crude analysis for additional covariate adjustments 
is a topic for future research. 



Define p x 
Pr(X 1= AV 

X 2 



APPENDIX A 

= Pr(y = l|X = x), q = 
--0),q ll =Pr(X 1 =X 2 -- 

cov(yi,y 2 |x 1 =x 2 =o), 



Pr(Jf=l), q 00 = 
:l), (7 d = Pr(X 1 / 

cov(y 1 ,y 2 | 



,11 



X 1 =X 2 = 1), c d = cov{Y u Y 2 \X l ^X 2 ), Vo = logit(po), 

lf) m = logit(pi) - logit(p°) and 1p = (lp ,1pm) T ■ ^m.crude 

in (18) can be expressed as the second element of the 
solution to Yli Ui(ip) = 0, where 



(1 - XaXYn - p°) + (1 - X i2 )(Y i2 -p' 



XaiYa-p^+X^Ya-p 1 ) 

It follows from standard theory that ra 1 / 2 ^ — tp) is 
asympotically normal with mean and variance 



E 



y dlpT var{ [7,(^)1 

where, after some algebra, 

dUi(^)\ _ f-2p°(l-p° 



E 



difj T J 



E< 



-1p v {\ -p v ) 





-2/(1 



P 1 ) 



and 

var{f7»(V))} 



2(l-q)p t, (l-p°) + q 00 c 00 q d c d 

2qp 1 (l-p 1 ) + q 11 c 11 



q d c d 



After additional algebra, the asymptotic variance for 
n x l 2 {i> m .crude -ipm) is obtained as 



+ 



2(l-g)p°(l-p°) 2qp 1 (l-p 1 ) 



q 00 c 00 



(32) + 4|p0(l_„0 



+ 



q ll c n 



4{p°(l-p )} 2 ^(l-p 1 )} 2 

qV 

~ 2q(l-q)p°(l-p°)p 1 (l -p 1 )' 

Replacing the population parameters in (32) with 
their sample counterparts gives the standard error 

for 1pm. crude- 

To derive the standard error formula in (12) we 
note that a regular 1:1 matched cohort can be ob- 
tained by setting q = 0.5, q 00 = q 11 = and q d = 1. 
The expression in (32) then simplifies to 



1 1 

+ 



(33) 



p°(l-p°) p l {l-p l ) 
2c d 



p°(l-p°)p 1 (l-p 1 )' 

Replacing the population parameters in (33) with 
their sample counterparts gives the standard error 
formula in (12). 

APPENDIX B 

Define ip$ = log{ pr*(yol^yi=o) }> hJ : ipl = 0, ipt 
■Pr»(y 1 =l)Pr«(y°=0) 1 „t../,t 



log{ p^[y i= oj pr*(yo = ij }i Hj : ipl = 0. Hj can be tested 
using the likelihood ratio test (LRT) statistic 

sup H i [(pKpW) 1 



± c,LR 



-2 log 



T 



LR 



-2\og< 



and hJ can be tested using the LRT statistic 
f suPntCPooPoiPioPn) 

where p y o y i = Pr* (Y° = y°,Y 1 =y 1 ), and the suprema 
are taken under the restrictions < p y o y i < 1 and 
YlyOy 1 Py°y 1 = 1- Regardless of whether (2), (3a) and 
(3b) hold or not, ip c . c i r and ips. crude are the nonpara- 
metric MLEs of ipt and ipt, respectively. Thus, T^. LR 
and T c are asymptotically equal, and T sLR and T s 
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are asymptotically equal. It is easy to show that Hj 
and Hj are equivalent (i.e., Hj holds if and only if Hj 
holds) , which implies that Tj LR and r| lr are iden- 
tical, which then in turn implies that T c and T s are 
asymptotically equal. 

It is easy to show that Ho and (3a) together im- 
ply Ht, and thus also Hj. Because V> c . c ; r converges 
to ipc, it then follows that ip c .cir converges to un- 
der Ho and (3a). 

APPENDIX C 

Under (26), we have that 
Pr(X u X 2 ,b) 

= _L e {^} 2 /2 Pr(Xl)X2) 

= h{X u b)h{X 2 ,b)e- e2x ^' 4 Pr(X 1 ,X 2 ), 
for some function h(-, ■). X\ _L X 2 \b now implies that 

e- 92XlX2/4 Pr(X l5 X 2 ) = k(X 1 )k(X 2 ) 

for some function k(-), which in turn implies that 
9 = 2^1^}. 
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